
feat: Limit the size of aggregated WebXDC update to 100 KiB (#4825) #5490

Merged: 1 commit into main from iequidoo/webxdc-update-limit-size, Jul 13, 2024

Conversation

@iequidoo (Collaborator) commented Apr 23, 2024

See commit messages.
Closes #4825

@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch from d4238cd to 0df1218 on April 24, 2024 00:02
@iequidoo changed the title from "feat: Limit the size of aggregated WebXDC update (#4825)" to "feat: Limit the size of aggregated WebXDC update to 100 KiB (#4825)" on Apr 24, 2024
@iequidoo marked this pull request as ready for review on April 24, 2024 00:14
@iequidoo requested a review from link2xt on April 24, 2024 00:15
@link2xt (Collaborator) commented Apr 25, 2024

> Also don't send any updates together with the WebXDC instance, to not complicate the code; the only downside is sending one more message when resending or forwarding WebXDC instances.

What happens in the case of a message draft where the sender wants to set up everything (e.g., a poll) before sending the message out? I think this should always be a single message so we are sure that the second message is not lost.

In the case of forwarding, we don't send updates anyway.

@iequidoo (Collaborator, Author) commented Apr 26, 2024

> What happens in the case of a message draft where the sender wants to set up everything (e.g., a poll) before sending the message out? I think this should always be a single message so we are sure that the second message is not lost.

Makes sense, I'll change this back. But in the case of resending, it's easier not to send any updates together with the instance, and instead send them in separate messages of limited size.

> In the case of forwarding, we don't send updates anyway.

True, and this is not changed; I will fix the commit message.

EDIT: Done

@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch 2 times, most recently from 8ddb466 to 24baa5c on April 26, 2024 07:57
@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch from 24baa5c to 7136e48 on May 21, 2024 03:46
@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch from 7136e48 to 5157fbc on June 12, 2024 18:32
@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch 2 times, most recently from fc11367 to 6c19bd6 on June 27, 2024 03:11
@link2xt (Collaborator) left a comment

I am not sure about not sending any updates in a resent webxdc. This makes resending small xdcs less reliable, because sending two messages in a row can trigger rate limits. Even sending 100 KiB plus the original .xdc without taking its size into account seems better than not sending any updates at all there; at least it is not worse than the current state. In many cases, e.g. games where updates are just a scoreboard, the total size of updates never reaches 100 KiB, so it would be better if no splitting happens in these cases.

@@ -1381,8 +1392,14 @@ impl MimeFactory {
} else if msg.viewtype == Viewtype::Webxdc {
let topic = peer_channels::create_random_topic();
headers.push(create_iroh_header(context, topic, msg.id).await?);
if let Some(json) = context
.render_webxdc_status_update_object(msg.id, None)
if self.resend {
@link2xt (Collaborator) commented:
Maybe add a comment inside the empty if, something like "Not sending WebXDC updates in the first part of a resent WebXDC".

@iequidoo (Collaborator, Author) commented Jul 10, 2024

Removed resend altogether. Now updates are sent together with the instance when resending it.
EDIT: We could also account for the instance size to respect the limit more accurately, but if so, it's better to do this in a separate commit.

@iequidoo (Collaborator, Author) commented Jul 11, 2024

@link2xt If we want to account for the instance size too, then it's a question whether we should do that only for resent instances or initially too. And if we want to account for the instance size only for resent instances, maybe it's better not to apply the size limit to initially sent instances at all, to avoid unwanted message splitting and problems when the second message (containing the initial updates) isn't delivered.

src/chat.rs: three outdated review threads (resolved)
src/chat.rs (outdated):
if range.0 > range.1 {
    return Ok(());
};
// `first_serial` must be decreased to make parallel running
@link2xt (Collaborator) commented:

What does "parallel running" mean here?

I am looking at the SMTP loop; there, send_smtp_messages flushes status updates, adding them at the end of the SMTP queue, and then sends all queued messages.

Saying "to make Context::flush_status_updates() send the updates again." is easier to read, unless I am missing something.

@iequidoo (Collaborator, Author) commented:

Tried to improve the comment. It's needed because it's not obvious why `first_serial - 1` is used here.

src/webxdc.rs: outdated review thread (resolved)
@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch from 6c19bd6 to bd023c3 on July 10, 2024 02:38
@iequidoo requested a review from link2xt on July 10, 2024 02:46
src/webxdc.rs (outdated):
/// Gets StatusUpdateSerial as untyped integer.
/// Avoid using this outside ffi.
pub fn to_u32(self) -> u32 {
    self.0
}

/// Returns next [StatusUpdateSerial].
/// May `MAX + 1`, but this value mustn't be used as a status update id.
@link2xt (Collaborator) commented:

Suggested change:
- /// May `MAX + 1`, but this value mustn't be used as a status update id.
+ /// May return `MAX + 1`, but this value mustn't be used as a status update id.

@link2xt (Collaborator) commented:

The comment says that the value must not be used as a status update ID, but there is actually nothing preventing it from being used as a status update ID. If .next() returns this status ID, it gets used. It seems the only reason for having MAX not equal to u32::MAX is to prevent render_webxdc_status_update_object from failing, because it calls next() on the last argument.

If there is actually no protection, then why not make next() use saturating_add and use u32::MAX for the maximum? Then next() does not need to return Result. I tried it, and no tests failed:

index 55ba6dd70..dc8bb98e0 100644
--- a/src/webxdc.rs
+++ b/src/webxdc.rs
@@ -126,7 +126,7 @@ pub fn new(id: u32) -> StatusUpdateSerial {
     /// Minimum value.
     pub const MIN: Self = Self(1);
     /// Maximum value.
-    pub const MAX: Self = Self(u32::MAX - 1);
+    pub const MAX: Self = Self(u32::MAX);

     /// Gets StatusUpdateSerial as untyped integer.
     /// Avoid using this outside ffi.
@@ -136,12 +136,8 @@ pub fn to_u32(self) -> u32 {

     /// Returns next [StatusUpdateSerial].
     /// May `MAX + 1`, but this value mustn't be used as a status update id.
-    pub fn next(self) -> Result<StatusUpdateSerial> {
-        Ok(StatusUpdateSerial(
-            self.0
-                .checked_add(1)
-                .context("StatusUpdateSerial overflow")?,
-        ))
+    pub fn next(self) -> StatusUpdateSerial {
+        StatusUpdateSerial(self.0.saturating_add(1))
     }
 }

@@ -778,7 +774,7 @@ pub(crate) async fn render_webxdc_status_update_object(
                         }
                         json.push_str(&update_item);
                     }
-                    Ok((json, last.next()?))
+                    Ok((json, last.next()))
                 },
             )
             .await?;
@@ -1926,7 +1922,7 @@ async fn test_pop_status_update() -> Result<()> {
         let mut instances_checked = 0;
         for i in 0..3 {
             let (instance, min_ser, max_ser, descr) = t.smtp_status_update_get().await?.unwrap();
-            t.smtp_status_update_pop_serials(instance, min_ser, max_ser.next()?)
+            t.smtp_status_update_pop_serials(instance, min_ser, max_ser.next())
                 .await?;
             let min_ser: u32 = min_ser.try_into()?;
             if instance == instance1.id {

@iequidoo (Collaborator, Author) commented:

Removed next() altogether. It was only used in render_webxdc_status_update_object(), where it is indeed better to replace it with saturating_add().

    (),
    |row| {
        let instance_id: MsgId = row.get(0)?;
-       let first_serial: StatusUpdateSerial = row.get(1)?;
+       let first_serial: i64 = row.get(1)?;
@link2xt (Collaborator) commented:

Why is this i64 now?

@iequidoo (Collaborator, Author) commented:

first_serial can be negative now because of the logic in chat::resend_msgs():

"INSERT INTO smtp_status_updates (msg_id, first_serial, last_serial, descr) \
VALUES(?, ?, ?, '') \
ON CONFLICT(msg_id) \
DO UPDATE SET first_serial=min(first_serial - 1, excluded.first_serial)",
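
For illustration, a self-contained sketch (plain rusqlite with made-up values; not deltachat-core code) of how repeated resends drive first_serial to 0 and below via this upsert, which is why the column must now be read as i64:

use rusqlite::{params, Connection};

fn main() -> rusqlite::Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE smtp_status_updates (
             msg_id INTEGER PRIMARY KEY,
             first_serial INTEGER NOT NULL,
             last_serial INTEGER NOT NULL,
             descr TEXT NOT NULL
         )",
    )?;
    let upsert = "INSERT INTO smtp_status_updates (msg_id, first_serial, last_serial, descr) \
                  VALUES (?, ?, ?, '') \
                  ON CONFLICT(msg_id) \
                  DO UPDATE SET first_serial=min(first_serial - 1, excluded.first_serial)";

    // First resend: the row is fresh, so first_serial == 1.
    conn.execute(upsert, params![100, 1, 5])?;
    // Second resend while the row still exists: min(1 - 1, 1) == 0.
    conn.execute(upsert, params![100, 1, 5])?;
    // Third resend: min(0 - 1, 1) == -1, below StatusUpdateSerial::MIN (1),
    // so reading the column as an unsigned serial would fail.
    conn.execute(upsert, params![100, 1, 5])?;

    let first_serial: i64 = conn.query_row(
        "SELECT first_serial FROM smtp_status_updates WHERE msg_id=100",
        [],
        |row| row.get(0),
    )?;
    assert_eq!(first_serial, -1);
    Ok(())
}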

Before, update sending might be delayed due to rate limits and the updates later merged into large messages. This is undesirable for apps that want to send large files over WebXDC updates, because the message with the aggregated update may be too large to actually send: it may hit the provider limit or require multiple attempts on a flaky SMTP connection.

So, don't aggregate updates if the size of the aggregated update would exceed the limit of 100 KiB. This is a soft limit: it may be exceeded if a single update is larger, and it limits only the update JSON size, so the message with all its envelopes may still be larger. The limit may also be exceeded when updates are sent together with the WebXDC instance on resending, as the instance size isn't accounted for, to keep the code simple. At least this is not worse than the previous behaviour, where all updates were attached.
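
To make the soft-limit rule concrete, here is a minimal standalone sketch of the aggregation described above (names like STATUS_UPDATE_SIZE_MAX and aggregate_updates are illustrative, not the actual deltachat-core API): updates are concatenated into one JSON array until adding the next one would cross 100 KiB, except that the first update is always taken, which is why the limit is soft.

/// Soft limit on the aggregated update JSON (assumed constant name).
const STATUS_UPDATE_SIZE_MAX: usize = 100 * 1024;

/// Aggregates serialized update items into one JSON array, stopping before
/// the soft limit would be exceeded. Returns the JSON and how many items
/// were consumed; a caller would send the remainder in follow-up messages.
fn aggregate_updates(items: &[String]) -> (String, usize) {
    let mut json = String::from("[");
    let mut taken = 0;
    for item in items {
        // Always take at least one item: a single oversized update is still
        // sent on its own, so the limit may be exceeded in that case.
        if taken > 0 && json.len() + 1 + item.len() >= STATUS_UPDATE_SIZE_MAX {
            break;
        }
        if taken > 0 {
            json.push(',');
        }
        json.push_str(item);
        taken += 1;
    }
    json.push(']');
    (json, taken)
}

fn main() {
    // Small updates, e.g. a game scoreboard, still aggregate into one message.
    let items = vec![r#"{"score":1}"#.to_string(); 3];
    let (json, taken) = aggregate_updates(&items);
    assert_eq!(taken, 3);
    assert_eq!(json, r#"[{"score":1},{"score":1},{"score":1}]"#);
}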
@iequidoo force-pushed the iequidoo/webxdc-update-limit-size branch from bd023c3 to a8ce09a on July 13, 2024 18:35
@iequidoo requested a review from link2xt on July 13, 2024 18:46
@iequidoo merged commit 9996c2d into main on Jul 13, 2024 (37 checks passed)
@iequidoo deleted the iequidoo/webxdc-update-limit-size branch on July 13, 2024 19:24